Minimizing Calibrated Loss using Stochastic Low-Rank Newton Descent for large scale image classification

Authors

  • Wafa BelHajAli
  • Richard Nock
  • Michel Barlaud
Abstract

A standard approach to large-scale image classification combines high-dimensional features with the Stochastic Gradient Descent (SGD) algorithm to minimize the classical hinge loss in the primal space. Although the complexity of Stochastic Gradient Descent is linear in the number of samples, these methods suffer from slow convergence. To cope with this issue, we propose a Stochastic Low-Rank Newton Descent (SLND) for the minimization of any calibrated loss in the primal space. SLND approximates the inverse Hessian by its best low-rank approximation with respect to the squared Frobenius norm. We provide core optimizations for fast convergence. On the theoretical side, we show explicit convergence rates of the algorithm for these calibrated losses, which in addition provide working sets of parameters for the experiments. Experiments are reported on the SUN, Caltech256 and ImageNet databases, with simple, uniform and efficient ways to tune the remaining SLND parameters. On each of these databases, SLND matches the accuracy of SGD while converging faster by an order of magnitude.
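As a rough illustration of the idea (not the authors' implementation), the sketch below performs one stochastic Newton-type update in which the mini-batch Hessian of a calibrated loss (logistic loss is assumed here) is replaced by its best rank-k approximation in the squared Frobenius norm, i.e. its top-k eigenpairs. The rank k, step size and damping term are all illustrative choices.

    # Minimal sketch of a stochastic low-rank Newton step (illustrative only).
    # Assumptions: binary labels in {-1, +1}, logistic loss as the calibrated
    # surrogate, rank-k eigendecomposition of the mini-batch Hessian as the
    # low-rank curvature estimate (best rank-k approximation in Frobenius norm).
    import numpy as np

    def slnd_step(w, X, y, k=10, lr=1.0, damping=1e-3):
        """One update on a mini-batch (X, y) using a rank-k inverse-Hessian estimate."""
        margins = y * (X @ w)
        p = 1.0 / (1.0 + np.exp(margins))        # sigma(-y * w^T x)
        grad = -(X.T @ (y * p)) / len(y)         # gradient of the logistic loss
        # Mini-batch Hessian of the logistic loss: X^T diag(p(1-p)) X / n
        H = (X.T * (p * (1.0 - p))) @ X / len(y)
        # Best rank-k approximation in squared Frobenius norm: keep top-k eigenpairs
        vals, vecs = np.linalg.eigh(H)
        top = np.argsort(vals)[-k:]
        vals_k, vecs_k = vals[top], vecs[:, top]
        # Apply the (damped) inverse of the rank-k curvature to the gradient
        newton_dir = vecs_k @ ((vecs_k.T @ grad) / (vals_k + damping))
        return w - lr * newton_dir

Restricting the curvature to the top-k eigen-directions keeps the per-step cost close to that of SGD while still rescaling the gradient with second-order information.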


Similar articles

Riemannian stochastic quasi-Newton algorithm with variance reduction and its convergence analysis

Stochastic variance reduction algorithms have recently become popular for minimizing the average of a large, but finite number of loss functions. The present paper proposes a Riemannian stochastic quasi-Newton algorithm with variance reduction (R-SQN-VR). The key challenges of averaging, adding, and subtracting multiple gradients are addressed with notions of retraction and vector transport. We...


Scalable Metric Learning via Weighted Approximate Rank Component Analysis

Our goal is to learn a Mahalanobis distance by minimizing a loss defined on the weighted sum of the precision at different ranks. Our core motivation is that minimizing a weighted rank loss is a natural criterion for many problems in computer vision such as person re-identification. We propose a novel metric learning formulation called Weighted Approximate Rank Component Analysis (WARCA). We th...


Fast Probabilistic Optimization from Noisy Gradients

Stochastic gradient descent remains popular in large-scale machine learning, on account of its very low computational cost and robustness to noise. However, gradient descent is only linearly efficient and not transformation invariant. Scaling by a local measure can substantially improve its performance. One natural choice of such a scale is the Hessian of the objective function: Were it availab...


The geometry of weighted low-rank approximations

The low-rank approximation problem is to approximate optimally, with respect to some norm, a matrix by one of the same dimension but smaller rank. It is known that under the Frobenius norm, the best low-rank approximation can be found by using the singular value decomposition (SVD). Although this is no longer true under weighted norms in general, it is demonstrated here that the weighted low-ra...

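The unweighted statement recalled in this abstract (the Eckart-Young result) can be illustrated in a few lines of NumPy; this is only a generic sketch of truncating the SVD, not code from the cited paper.

    # Best rank-k approximation of a matrix under the Frobenius norm via the SVD.
    import numpy as np

    def best_rank_k(A, k):
        """Return the rank-k matrix closest to A in Frobenius norm."""
        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        return (U[:, :k] * s[:k]) @ Vt[:k, :]

    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 30))
    A2 = best_rank_k(A, 2)
    print(np.linalg.matrix_rank(A2), np.linalg.norm(A - A2))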

Fast large-scale optimization by unifying stochastic gradient and quasi-Newton methods

We present an algorithm for minimizing a sum of functions that combines the computational efficiency of stochastic gradient descent (SGD) with the second order curvature information leveraged by quasi-Newton methods. We unify these disparate approaches by maintaining an independent Hessian approximation for each contributing function in the sum. We maintain computational tractability and limit ...




Publication date: 2013